102 research outputs found

    An Investigation of the Articulatory Correlates of Vowel Anteriority in Kazakh, Kyrgyz, and Turkish using Ultrasound Tongue Imaging

    This paper presents an articulatory study of vowel production in three Turkic languages (Kazakh, Kyrgyz, and Turkish) using ultrasound tongue imaging in order to determine what aspects of tongue position correspond to the vowel anteriority contrasts in these languages, especially regarding the tongue body (TB) and tongue root (TR). The results of this study suggest that the Turkish vowel anteriority contrast involves mainly TB position, whereas the Kazakh and Kyrgyz vowel anteriority contrasts involve both TR and TB position. This latter pattern appears to confirm the existence of a type of vowel anteriority contrast that has been hypothesised but not previously verified instrumentally.

    Multi-Script Morphological Transducers And Transcribers For Seven Turkic Languages

    This paper describes ongoing work to augment morphological transducers for seven Turkic languages with support for multiple scripts each, as well as respective IPA transcription systems. Evaluation demonstrates that our approach yields coverage equivalent to or not much lower than that of the base transducers.

    UD Annotatrix: An Annotation Tool For Universal Dependencies

    In this paper, we introduce UD Annotatrix, a tool for manual annotation of Universal Dependencies. The tool is designed to be tailored to the needs of the Universal Dependencies (UD) community, including the ability to operate in fully offline mode, and is freely available under the GNU GPL licence. We provide some background to the tool, an overview of its development, and a description of how it works. We compare it with other widely used tools for Universal Dependencies annotation, describe some features unique to UD Annotatrix, and finally outline some avenues for future work and provide a few concluding remarks.

    Turkic Nasal Harmony as Surface Correspondence

    Turkic languages are well known for syllable contact phenomena – sonority-driven processes where suffix-initial sonorants surface as obstruents in certain environments. These alternations interact with nasal harmony, a less studied phenomenon where underlying stops and nasals surface as nasals between two nasals. Nasal harmony is attested in about ten Turkic languages (Shor, various Khakas varieties, northern and southern Altay varieties, Kazakh, Qaraqalpaq, Noghay, possibly Karachay-Balkar, and Kazan and Siberian Tatar varieties), and it varies in its scope and in how it interacts with syllable contact phenomena. In this paper, we provide a detailed description of nasal harmony in Kazakh, which has one of the richest nasal harmony systems, and explore an analysis within Surface Correspondence Theory.

    Delineating Turkic non-finite verb forms by syntactic function

    In this paper, we argue against the primary categories of non-finite verb used in the Turkology literature: “participle” (причастие ‹pričastije›) and “converb” (деепричастие ‹dejepričastije›). We argue that both of these terms conflate several discrete phenomena, and that they furthermore are not coherent as umbrella terms for these phenomena. Based on detailed study of the non-finite verb morphology and syntax of a wide range of Turkic languages (presented here are Turkish, Kazakh, Kyrgyz, Tatar, Tuvan, and Sakha), we instead propose delineation of these categories according to their morphological and syntactic properties. Specifically, we propose that more accurate categories are verbal noun, verbal adjective, verbal adverb, and infinitive. This approach has far-reaching implications for the study of syntactic phenomena in Turkic languages, ranging from relative clauses to clause chaining.

    A phonetic study of length and duration in Kyrgyz vowels

    This paper examines the phonetic correlates of the (phonological) vowel length contrast in Kyrgyz to address a range of questions about the nature of this contrast, and also explores factors that affect (phonetic) duration in short vowels. Measurement and analysis of the vowels confirm that there is indeed a significant duration distinction between the Kyrgyz vowel categories referred to as short and long vowels. Preliminary midpoint formant measurements show that there may be some accompanying spectral component to the length contrast for certain vowels, but findings are not conclusive. A comparison of F0 dynamics and spectral dynamics through long and short vowels does not yield evidence that some long vowels may in fact be two heterosyllabic short vowels. Analysis shows that duration is associated with a vowel’s presence in word-edge syllables in Kyrgyz, as anticipated based on descriptions of word-final stress and initial prominence. However, high vowels and non-high vowels are found to consistently exhibit opposite durational effects. Specifically, high vowels in word-edge syllables are longer than high vowels in medial syllables, while non-high vowels in word-edge syllables are shorter than non-high vowels in medial syllables. This suggests either a phenomenon of durational neutralisation at word edges or the exaggeration of durational differences word-medially, and is not taken as a case of word-edge strengthening. Proposals for how to choose between these hypotheses in future work are discussed.

    Machine Translation for Crimean Tatar to Turkish

    In this paper, a machine translation system for Crimean Tatar to Turkish is presented. To our knowledge, this is the first machine translation system made available for public use for Crimean Tatar, and the first such system released as free and open source software. The system was built using Apertium, a free and open source machine translation system, and is currently unidirectional from Crimean Tatar to Turkish. We describe our translation system, evaluate it on parallel corpora, and compare its performance with a neural machine translation system trained on the limited amount of corpora available.

    Apertium’s Web Toolchain For Low-Resource Language Technology

    The Apertium web toolchain, consisting of a front end (Apertium HTML-Tools) and a back end (Apertium APy), is a free and open-source toolchain that supports a range of open-source technologies. The internationalised interface allows users to translate text, documents, and web pages, as well as morphologically analyse and generate text. Other features, including support for multi-step/pivot translation, dictionary-style lookup, spell-checking, and accepting user suggestions for translations, are nearing release.

    Rule-Based Machine Translation From Kazakh To Turkish

    This paper presents a shallow-transfer machine translation (MT) system for translating from Kazakh to Turkish. Background on the differences between the languages is presented, followed by how the system was designed to handle some of these differences. The system is based on the Apertium free/open-source machine translation platform. The structure of the system and how it works are described, along with an evaluation against two competing systems. Linguistic components were developed, including a Kazakh-Turkish bilingual dictionary, Constraint Grammar disambiguation rules, lexical selection rules, and structural transfer rules. With many known issues yet to be addressed, our RBMT system has reached performance comparable to publicly available corpus-based MT systems between the languages.